Compared with typical document images, text in video is considerably harder to extract because of its low resolution, complex backgrounds, lighting variation, and unconstrained position, shape, and color. A method is presented that automatically localizes text in both the compressed domain and the spatial domain.